A DNA signal sequence initially isolated from Streptomyces griseus encodes a signal peptide which directs
the secretion, via a fused intermediate, of a protein from the cell within which the DNA signal sequence is
expressed. The signal sequence is derived from genes encoding protease A and protease B of S. griseus. The
DNA signal sequence encodes a thirty-eight amino acid signal peptide. A DNA construct, including the DNA
signal sequence and a gene sequence encoding a protein, when transformed into a living cell by a suitable
vector, results in the signal peptide correctly directing the secretion of a mature protein of desired structure,
particularly from prokaryotic genera selected for their ability to display enzymatic activity of a type typified by,
but not exclusive to, that of protein disulphide oxidoreductase, EC 5.3.4.1, more particularly in the genus
Streptomyces, and most particularly in Streptomyces lividans 66.
CHARACTERIZATION AND STRUCTURE OF GENES FOR PROTEASE A AND PROTEASE B FROM STREPTOMYCES GRISEUS



FIELD OF THE INVENTION 

This invention relates to a biologically pure DNA signal sequence which encodes an amino acid signal
peptide necessary for directing the secretion from certain defined hosts of proteins in bioactive form.

BACKGROUND OF THE INVENTION

In the biological production of commercially viable proteins by the fermentation of microorganisms, the
ability to produce the desired proteins by fermentation with secretion of the proteins by the microorganisms
into the broth is very significant. However, there are many commercially viable proteins encoded by
genetically engineered DNA constructs which are not secreted by the cells in which the DNA is expressed.
This often necessitates harvesting the cells, bursting the cell walls, recovering the desired proteins in pure
form and then chemically re-naturing the pure material to restore its bioactive function. This downstream
processing, as it is called, is illustrated in Figure 1.

Some cells and microorganisms carry out the biological equivalent of downstream processing by
secreting proteins in bioactive form. The mechanism which directs the secretion of some proteins through
the cell walls is not fully understood. For example, Streptomyces griseus, an organism used for the
commercial production of Pronase, secretes many extracellular proteins (Jurasek, L., P.
Johnson, R.W. Olafson, and L.B. Smillie (1971), An improved fractionation system for pronase on
CM-Sephadex, Can. J. Biochem., 49:1195-1201). Protease A and protease B, two of the serine proteases
secreted by S. griseus, have sequences which are 61% homologous on the basis of amino acid identity
(Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease
at 1.7 Å resolution; Analysis of hydrogen bonding and solvent structure, J. Mol. Biol., 183:479-502; Jurasek,
L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and L.H. Ericsson (1974), Amino acid sequencing of
Streptomyces griseus protease B, a major component of pronase, Biochem. Biophys. Res. Comm.,
61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and M.O. Dayhoff (1978), Serine proteases, In
M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5, suppl. 3:73-93). These proteases also have
similar tertiary structure, as determined by X-ray crystallography (Delbaere, L.T.J., W.L.B. Hutcheon, M.N.G.
James, and W.E. Thiessen (1975), Tertiary structural differences between microbial serine proteases and
pancreatic serine enzymes, Nature 257:758-763; Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G.
James (1985), Refined structure of α-lytic protease at 1.7 Å resolution; Analysis of hydrogen bonding and
solvent structure, J. Mol. Biol., 183:479-502; James, M.N.G., A.R. Sielecki, G.D. Brayer, L.T.J. Delbaere, and
C.-A. Bauer (1980), Structures of product and inhibitor complexes of Streptomyces griseus protease A at
1.8 Å resolution, J. Mol. Biol., 144:43-88). Although the structures of proteases A and B have been
extensively studied, the genes encoding these proteases have not been characterized before.



SUMMARY OF THE INVENTION

In accordance with this invention, the genes encoding protease A and protease B of S. griseus have
been isolated and investigated to reveal DNA sequences which each direct the secretion of an encoded
protein fused either directly or indirectly to a signal peptide encoded by the DNA.

According to an aspect of the invention, a recombinant DNA sequence comprises a signal sequence
and a gene sequence encoding a protein. The recombinant DNA sequence, when expressed in a living cell,
encodes an amino acid signal peptide with the protein. The signal peptide directs secretion of the protein
from a cell within which the DNA signal sequence is expressed.

According to another aspect of the invention, a biologically pure isolated DNA signal sequence encodes
a 38 amino acid signal peptide which directs secretion of a recombinant gene-sourced protein linked to
such 38 amino acid signal peptide, from a cell in which the DNA signal sequence is expressed. The DNA
signal sequence is isolated from Streptomyces griseus.

According to another aspect of the invention, the DNA signal sequence in conjunction with a gene 
sequence encoding a protein is inserted into a vector, such as a plasmid or a phage. 

According to another aspect of the invention, the DNA signal sequence is adapted for expression in a
living cell having enzymes catalyzing the formation of disulphide bonds.

According to another aspect of the invention, the biologically pure isolated DNA signal sequence is that of
Figure 4a.

According to another aspect of the invention, the biologically pure isolated DNA signal sequence is that of
Figure 5a.

According to another aspect of the invention, a fused protein is encoded by the recombinant DNA 
sequence of Figure 4 or Figure 5. 

According to another aspect of the invention, a transformed prokaryotic cell is provided which has 
inserted therein a suitable vector including the recombinant DNA encoding the signal protein. The 
transformed prokaryotic cell may be selected from the Streptomyces genera. 

According to another aspect of the invention, a biologically pure culture has a transformed prokaryotic 
cell with the recombinant DNA sequence in a suitable vector. The culture is capable of producing, as an 
intermediate, the fused protein of the amino acid signal peptide and the protein. The protein itself is 
produced in a recoverable quantity upon fermentation of the transformed cell in an aqueous nutrient 
medium. The signal peptide directs secretion of the protein from the cell. 

According to another aspect of the invention, a biologically pure culture, transformed with the functional 
signal sequence as described above, is able to direct the secretion from the cell of proteins whose 
bioactivity is dependent upon the formation of correctly positioned intramolecular disulphide bonds. 

A biologically pure DNA sequence encoding a fused protein including protease A has the combined
DNA sequence of Figures 4a, 4b and 4c.

A biologically pure DNA sequence encoding a fused protein including protease B has the combined
DNA sequence of Figures 5a, 5b and 5c.

BRIEF DESCRIPTION OF THE DRAWINGS 

With reference to the Figures, a variety of short forms have been used to identify restriction
endonucleases, amino acids, deoxyribonucleic acids and related information. Standard nomenclature has
been used in identifying all of these components, as is readily appreciated by those skilled in the art.

Preferred embodiments of the invention are described with respect to the drawings, wherein:
Figure 1 illustrates downstream processing;
Figure 2 shows restriction endonuclease maps of DNA fragments of sprA and sprB;
Figure 3 illustrates restriction endonuclease maps and sequencing strategies in sequencing DNA
fragments containing sprA and sprB;
Figure 4 is the DNA sequence of sprA;
Figure 4a is the DNA sequence encoding the sprA (protease A) signal peptide;
Figure 4b is the DNA sequence encoding the sprA (protease A) propeptide;
Figure 4c is the DNA sequence encoding mature protease A;
Figure 5 is the DNA sequence of sprB;
Figure 5a is the DNA sequence encoding the sprB (protease B) signal peptide;
Figure 5b is the DNA sequence encoding the sprB (protease B) propeptide;
Figure 5c is the DNA sequence encoding mature protease B;
Figure 6 is an alignment of the amino acid sequences deduced from sprA and sprB to develop
homology between the two sequences.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The organism Streptomyces griseus is a well recognized microorganism. It is commercially used for the
production of the enzyme Pronase. It is appreciated, however, that this organism also secretes two
enzymes, protease A and protease B, which are both serine proteases. Although the structures of proteases
A and B have been extensively studied, the genes encoding these proteins, and the manner in which this
genetic information is used to signal secretion by the cells, are not understood. According to this invention,
the genes which encode protease A and protease B and provide for the secretion of these proteins in
bioactive form have been discovered. It has been determined that each of protease A and B is included in a
precursor protein which is processed to remove an amino-terminal polypeptide portion from the mature
protease. It has further been determined that each of protease A and B precursor proteins is enzymatically
processed to form correctly-positioned intramolecular disulphide bonds, which processing is concomitant
with removal of the amino terminal addressing peptide from the mature precursor. The discovered genes,
which encode proteases A and B, their intermediate address-competent forms, and their control elements,
have been designated sprA and sprB.

As discussed in the following articles, Jurasek, L., M.R. Carpenter, L.B. Smillie, A. Gertler, S. Levy, and
L.H. Ericsson (1974), Amino acid sequencing of Streptomyces griseus protease B, a major component of
pronase, Biochem. Biophys. Res. Comm. 61:1095-1100; Young, C.L., W.C. Barker, C.M. Tomaselli, and
M.O. Dayhoff (1978), Serine proteases, In M.O. Dayhoff (ed.), Atlas of Protein Sequence and Structure 5,
suppl. 3:73-93, proteases A and B are homologous proteins containing several segments of identical amino
acid sequence. In accordance with this invention, it was assumed that the genes which encode and direct
the secretion of proteases A and B would contain identical DNA sequences corresponding to the regions of
identity in the homologous proteins. This assumption, that identity in portions of the gene sequences would
occur, was made so that an oligonucleotide probe could be designed from one of the similar regions in the
sequences in order to isolate the genes.

In order to extrapolate the gene sequence which would encode the similar amino acid sequence, the
known codon bias for Streptomyces was relied upon to develop the nucleotide probe (see Bernan, V., D.
Filpula, W. Herber, M. Bibb, and E. Katz (1985), The nucleotide sequence of the tyrosinase gene from
Streptomyces antibioticus and characterization of the gene product, Gene 37:101-110; Bibb, M.J., M.J. Bibb,
J.M. Ward, and S.N. Cohen (1985), Nucleotide sequences encoding and promoting expression of three
antibiotic resistance genes indigenous to Streptomyces, Mol. Gen. Genet. 199:26-36; Thompson, C.J., and G.S.
Gray (1983), Nucleotide sequence of a streptomycete aminoglycoside phosphotransferase gene and its
relationship to phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA
80:5190-5194). Once the probe was constructed, it was then possible to probe the DNA sequences of S. griseus to
determine if there were any corresponding nucleic acid sequences in the microorganism. Since it was
known that there were two proteases, A and B, the oligonucleotide probe was expected to reveal two DNA
fragments detected by hybridization analysis, and in fact, not only did the probe hybridize equally to two
fragments generated in the genomic library of S. griseus, but also two fragments generated by BamHI
digest (8.4 kb and 6.8 kb) or BglII digest (11 kb and 2.8 kb) were isolated from the genomic library. As a
cross-check with respect to the predictability of such probe, the same fragments were detected in genomic DNA
libraries of other isolates of S. griseus. It was noted, however, that there was no such hybridization of the
oligonucleotide probe with DNA from other Streptomyces such as S. lividans.
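
The back-translation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codon table is a hypothetical high-GC preference table chosen to mimic the G/C-rich third codon position reported for Streptomyces genes, the peptide in the demonstration is invented, and neither is the probe or the table actually used in this work.

```python
# Illustrative back-translation of a conserved peptide into a "best guess" probe.
# ASSUMPTION: PREFERRED_CODON is a hypothetical high-GC codon preference table,
# not a measured Streptomyces codon-usage table.
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGC",
    "S": "TCC", "T": "ACC", "V": "GTC", "W": "TGG", "Y": "TAC",
}


def back_translate(peptide: str) -> str:
    """Return the coding-strand DNA using one preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())


def reverse_complement(dna: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    conserved_peptide = "MGW"  # hypothetical segment shared by two proteins
    coding = back_translate(conserved_peptide)
    print(coding, reverse_complement(coding), round(gc_fraction(coding), 2))
```

Choosing a single preferred codon per residue keeps the probe non-degenerate; the trade-off is that any mismatch with the real gene lowers hybridization stringency, which is why a region of identical amino acid sequence is the natural place to design such a probe.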

Plasmids were constructed containing digested fragments of S. griseus DNA. The oligonucleotide probe was
used to isolate developed plasmids containing sprA and sprB. The screening by use of the probe was
accomplished by colony blot hybridization where approximately 15,000 E. coli transformants containing the
developed plasmids were screened. Twelve transformants were detected by the probe and isolated for
further characterization. These colonies contained two distinct classes of plasmid based on restriction
analysis. As determined from the hybridization of genomic DNA, the plasmids contained either the 6.8 kb or
the 8.4 kb BamHI fragment. These fragments contained the sprA and sprB genes.

The fragments as isolated by hybridization screening were tested for the expression of proteolytic
activity. With these plasmids identified, such characterization may be accomplished in accordance with a
variety of known techniques in accordance with a preferred embodiment of this invention.

The 6.8 kb and 8.4 kb BamHI fragments were ligated into the BglII site of the vector pIJ702.
Transformants of S. lividans containing these constructions were tested on a milk plate for secretion of
proteases. A clear zone, which represented the degradation of the milk proteins, surrounded each
transformant that contained either BamHI fragment. It was noted that the clear zones were not found around
S. lividans colonies which contained either pIJ702 only or no plasmid construct.
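
Ligating a BamHI fragment into a BglII site works because both enzymes leave the same four-base 5' overhang. The short Python sketch below illustrates this using the standard recognition sequences (BamHI G^GATCC, BglII A^GATCT); the helper names are hypothetical and introduced only for this illustration.

```python
# Recognition site and cut offset (bases from the 5' end of the top strand).
SITES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def overhang(enzyme: str) -> str:
    """Single-stranded 5' overhang left after cutting a palindromic site."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def junction(vector_enzyme: str, insert_enzyme: str) -> str:
    """Sequence formed where a cut vector end is ligated to a cut insert end."""
    vector_site, vector_cut = SITES[vector_enzyme]
    insert_site, insert_cut = SITES[insert_enzyme]
    return vector_site[:vector_cut] + insert_site[insert_cut:]


def recut_by(sequence: str) -> list:
    """Names of the enzymes in SITES whose full site occurs in the sequence."""
    return [name for name, (site, _) in SITES.items() if site in sequence]


if __name__ == "__main__":
    print(overhang("BamHI"), overhang("BglII"))  # identical overhangs
    hybrid = junction("BglII", "BamHI")
    print(hybrid, recut_by(hybrid))
```

The hybrid junction contains neither complete recognition site, so an insert cloned this way cannot be excised again with either enzyme alone; this is a general property of the enzyme pair, not a statement about any particular construct described here.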

Proteolytic activity was also observed when the BamHI fragments were cloned in either orientation with
respect to the vector, thereby minimizing the possibility of read-through transcription of an incomplete
protease gene. This observation provides evidence that each of the two BamHI fragments contains an intact
protease gene which is capable of effecting secretion in a different Streptomyces species, for example
S. lividans. With this particularly relevant characterization of the BamHI fragments, and knowing that the desired
gene was in these fragments, it was possible to isolate and to sequence the genes encoding protease A
and protease B.

According to a preferred aspect of this invention, the particular protease gene contained within each
cloned BamHI fragment was determined by dideoxy sequencing of the plasmids using the oligonucleotide
probe as a primer in such analysis. The 8.4 kb BamHI fragment was found to contain sprB, because a
polypeptide deduced from the DNA sequence matched a unique segment of the known amino acid
sequence of protease B. The 6.8 kb BamHI fragment contained sprA by process of elimination. The
protease genes in these fragments were localized by digesting the plasmids and determining which of the
restriction fragments of the plasmids were capable of hybridizing to the oligonucleotide probe.

Figure 2 shows detailed restriction maps of the 6.8 kb and 8.4 kb BamHI fragments. Hybridization to the
oligonucleotide probe was confined to a 0.9 kb PvuII-StuI fragment of sprA and a 0.6 kb PvuII-PvuI fragment
of sprB. Such hybridization is indicated by the heavy lines in Figure 2. Hybridization to the cloned BamHI
fragments and the 2.8 kb BglII fragment of sprB agrees with the hybridization to BamHI and BglII fragments
of genomic DNA. Thus, rearrangement of the BamHI fragments containing the protease genes is unlikely to
have occurred.

The functional portions of the sprA- and sprB-containing DNA were determined by subcloning restriction
fragments thereof into pIJ702. The constructed plasmids were transformed into S. lividans and tested for
proteolytic activity. The 3.2 kb BamHI-BglII fragment of sprA and the 2.8 kb BglII fragment of sprB, when
subcloned into pIJ702 in either orientation, resulted in the secretion of protease from S. lividans. The intact
protease genes were further delimited to a 1.9 kb StuI fragment for sprA and a 1.4 kb BssHII fragment for
sprB. With reference to Figure 2, each of these functionally active subclones is indicated below the
restriction maps which contain the region for each gene which hybridized to the oligonucleotide probe.

In order to determine the nucleic acid sequence of the protease genes, the 3.2 kb BamHI-BglII fragment
of sprA and the 2.8 kb BglII fragment of sprB were subcloned into pUC18 to facilitate further structural
characterization. Figure 3 shows the restriction maps of these subclones and the strategies which
were used to sequence the 1.4 kb SalI fragment containing sprA and the 1.4 kb BssHII fragment containing
sprB. The resultant DNA sequences of sprA and sprB are shown in Figures 4 and 5,
respectively. The predicted amino acid sequence of protease A differed from the published sequence by
the amidation of amino acid 133, whereas that of protease B was identical to the published sequence (see
Fujinaga, M., L.T.J. Delbaere, G.D. Brayer, and M.N.G. James (1985), Refined structure of α-lytic protease
at 1.7 Å resolution; Analysis of hydrogen bonding and solvent structure, J. Mol. Biol. 183:479-502).

Analyzing the sequences of Figures 4 and 5, it is apparent that each sequence contains a large open
reading frame with the coding region of the mature protease situated at the 3' end. For the protease A and
protease B genes, the sequence encoding the carboxy-terminus of the protease is followed immediately by
a translation stop codon. At the other end of the sequence, the predicted amino acid sequences appear to
extend beyond the amino-termini of the mature proteases A and B by an additional 116 amino acids for
sprA of Figure 4 and 114 amino acids for sprB of Figure 5. The putative GTG initiation codons at each of
these positions (-116 for Figure 4; -114 for Figure 5) are each preceded by a potential ribosome binding site
(as indicated by the series of five dots above the sequence) and followed by a sequence which encodes a
signal peptide. The processing site for the signal peptidase (identified by the light arrow in Figures 4 and 5)
is predicted at 38 amino acids from the amino-terminus of the putative precursor. [For clarity, that part of
the nucleic acid sequences of Figures 4 and 5 corresponding to the signal peptide portion of sprA and sprB
is reproduced in Figures 4A and 5A, respectively]. The propeptide is encoded by the remaining sequence
between the signal processing site (light arrow) and the start of the mature protein (indicated at the dark
arrow). [For clarity, that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the propeptide
portion of sprA and sprB is reproduced in Figures 4B and 5B, respectively]. The mature protease is
encoded by the codon sequence 1 through 181 for Figure 4 and 1 through 185 for Figure 5. [For clarity,
that part of the nucleic acid sequences of Figures 4 and 5 corresponding to the mature protein portion of
sprA and sprB is reproduced in Figures 4C and 5C, respectively]. The amino acid sequence for codons
-116 through +181 of Figure 4 and the amino acid sequence for codons -114 through +185 of Figure 5,
when made in the living cell S. griseus, are acted upon in a manner to produce in the culture medium
externally of the living cells the mature bioactive enzymes protease A and protease B. The processing
involved in accordance with the contained information encoded by that portion of the gene from start of the
promoter to start of the mature protein in each case included providing a secretory address, the correct
signal peptide processing site, the necessary propeptide structure not only for secretion but also for correct
disulphide bond formation concomitant with secretion, and competent secretion in bioactive form.
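
The codon arithmetic in the preceding paragraph can be checked with a small Python sketch. It uses only the numbers stated above (a 38-residue signal peptide, leader extensions of 116 and 114 residues, and mature lengths of 181 and 185 residues); the class and function names are hypothetical. The propeptide lengths it derives (78 and 76) are simple differences; Table I lists 79 codons for the propeptide region, which may count alignment columns including gaps, although that reading is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple

SIGNAL_CODONS = 38  # signal peptidase site predicted 38 residues from the start


@dataclass(frozen=True)
class PrecursorLayout:
    signal: int
    propeptide: int
    mature: int

    @property
    def total(self) -> int:
        return self.signal + self.propeptide + self.mature


def layout(leader_codons: int, mature_codons: int,
           signal_codons: int = SIGNAL_CODONS) -> PrecursorLayout:
    """Split a precursor into signal, propeptide and mature segment lengths."""
    if leader_codons < signal_codons:
        raise ValueError("leader is shorter than the signal peptide")
    return PrecursorLayout(signal_codons, leader_codons - signal_codons,
                           mature_codons)


def split(precursor: str, lay: PrecursorLayout) -> Tuple[str, str, str]:
    """Cut a precursor amino acid string at the two processing sites."""
    if len(precursor) != lay.total:
        raise ValueError("precursor length does not match the layout")
    pro_end = lay.signal + lay.propeptide
    return precursor[:lay.signal], precursor[lay.signal:pro_end], precursor[pro_end:]


if __name__ == "__main__":
    for gene, leader, mature in (("sprA", 116, 181), ("sprB", 114, 185)):
        lay = layout(leader, mature)
        print(gene, lay, lay.total)
```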

In accordance with this invention, the ability of the signal peptide to direct the secretion of bioactive
protein was established by inserting known DNA sequences at the beginning and at the end of known
sequences. For example, consider the sequence shown in Figure 5. In particular, the promoter and initiator
ATG of the aminoglycoside phosphotransferase gene (Thompson, C.J., and G.S. Gray (1983), Nucleotide
sequence of a streptomycete aminoglycoside phosphotransferase gene and its relationship to
phosphotransferases encoded by resistance plasmids, Proc. Natl. Acad. Sci. USA, 80:5190-5194) was inserted
preceding the second codon (AGG at -113) of the signal sequence of Figure 5. Due to the insertion of this
new promoter and initiator, the sprB gene, now under the control of this non-native promoter, directed both
elevated levels and earlier expression of proteolytic activity when compared with the unaltered sprB gene.
The secretion of bioactive protease B in this construction indicated that nucleic acid sequences preceding
the GTG initiation codon at -114 are not required for the correct secretion of protease B in bioactive
form, provided an active and competent promoter is placed in the precise location indicated.

In order further to demonstrate the universality of the discovered signal peptide, the sprB coding region
was replaced with a gene sequence encoding the mature amylase from S. griseus. Hence the nucleic acid
sequence encoding the amylase was inserted in place of the sequence of Figure 5 to the right of the light
arrow. It was determined that the resulting genetic construction directed the production of an extracellular
protein having an N-terminal alanine, properly positioned intramolecular disulphide bonds, and exhibiting
amylolytic activity at a level comparable to that of a similar construction with the natural signal peptide of
amylase. In accordance with this invention, the 38 amino acid signal peptide of Figures 4 and 4A and 5 and
5A is sufficient to direct the secretion of non-native protein in bioactive form.

Since both signal sequences encode the signal peptides of Figures 4 and 4A and 5 and 5A, the
organization of the coding regions of sprA and sprB was investigated by comparing the amino acid
homology of the encoded peptide sequences. Such comparisons are set out in Figure 6, where amino acid
homology has been compared for the signal peptide of Figure 6a, the propeptide of Figure 6b and the
mature protease of Figure 6c. A summary of such homology is provided in the following Table I.

TABLE I

Homology of sprA and sprB Coding Regions

Region                  Length      Protein       DNA
                        (codons)    Homology %    Homology %

Signal                    38          50            56
Propeptide                79          43            62
NT protease (a)           87          45            58
CT protease (b)          103          75            75
Total protease           190          61            67
Total coding region      307          55            65

(a) amino-termini of mature proteases (amino acids 1-87)
(b) carboxy-termini of mature proteases (amino acids 88-190)



The alignment of amino acid sequences translated from the coding regions of the sprA and sprB genes
indicates an overall homology of 54% on the basis of amino acid identity. As indicated in Table I, the
sequence homology is not uniformly distributed throughout the coding region of the sprA and sprB genes.
The carboxy-terminal domains of the proteases A and B are 75% homologous as noted under the heading
"CT protease" whereas the average homology for the remainder of the coding region is only 45%, indicated
under the heading "NT protease". The amino terminal domains containing the signal and propeptide
regions were similar in both extent of homology and distribution of consensus sequences, as indicated
under the headings "signal" and "propeptide". The unexpectedly high DNA sequence homology relative to
that of the protein sequences is particularly due to the 61% conservation in the third position of each codon
of the sequence. These investigations, revealing the close homology between sprA and sprB genes,
suggest that both genes originated by duplication of a common ancestral gene. With appropriate care and
investigation, the commonality of the signal peptides can be determined, thus establishing the cue for
secretion of proteins and hence providing sufficient information to construct, from the signal DNA of sprA
and sprB, a single nucleic acid sequence which will be competent to direct protein secretion.
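
As a rough consistency check on Table I, the sketch below computes percent identity over an alignment and recombines the per-region protein homologies as length-weighted averages. It is illustrative only: the function names are hypothetical, the identity convention (matches divided by alignment columns, gaps never matching) is one of several in use and is an assumption here, and the inputs are the rounded values printed in the table.

```python
from typing import Iterable, Tuple


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same non-gap residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * matches / len(a)


def weighted_mean(parts: Iterable[Tuple[int, float]]) -> float:
    """Length-weighted mean of (length, percent) pairs."""
    parts = list(parts)
    total_length = sum(length for length, _ in parts)
    return sum(length * pct for length, pct in parts) / total_length


# Protein homology column of Table I, as (codons, percent).
NT_PROTEASE = (87, 45.0)
CT_PROTEASE = (103, 75.0)
SIGNAL = (38, 50.0)
PROPEPTIDE = (79, 43.0)

total_protease = weighted_mean([NT_PROTEASE, CT_PROTEASE])
total_coding = weighted_mean([SIGNAL, PROPEPTIDE, (190, total_protease)])

if __name__ == "__main__":
    print(round(total_protease), round(total_coding))
```

Recombined this way, the protein column reproduces the tabulated totals for the mature protease and the whole coding region after rounding. Applying the same calculation to the rounded DNA column gives values within about one percentage point of the tabulated totals, which is the kind of discrepancy rounded inputs would be expected to produce.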

In accordance with the invention, a recombinant DNA sequence can be developed which encodes a
desired protein where the expressed protein, in conjunction with the signal peptide and optionally the
propeptide, provides for secretion of the desired protein in bioactive form. The recombinant DNA sequence
may be inserted in a suitable vector for transforming a desired cell for manufacturing the protein. Suitable
expression vectors may include plasmids and viral phages. As is appreciated by those skilled in the art, the
bioactivity of secretory proteins is assured by establishing the correct configuration of intramolecular
disulphide bonds. Thus, suitable prokaryotic hosts may be selected for their ability to display enzymatic
activity of a type typified by, but not limited to, that of protein disulphide oxidoreductase, EC 5.3.4.1.

The particular protein encoded by the recombinant DNA sequence may include eukaryotic secretory
enzymes, such as prochymosin, chymotrypsin, trypsins, amylases, ligninases, chymosin, elastases, lipases,
and cellulases; prokaryotic secretory enzymes, such as glucose isomerase, amylases, lipases, pectinases,
cellulases, proteinases, oxidases, ligninases; blood factors, such as Factor VIII and Factor IX and Factor
VIII-related biosynthetic blood coagulant proteins; tissue-type plasminogen activator; hormones, such as
proinsulin; lymphokines, such as beta and gamma-interferon, and interleukin-2; enzyme inhibitors, such as
extracellular proteins whose action is to destroy antibiotics either enzymatically or by binding, for example,
a β-lactamase inhibitor, α-trypsin inhibitor; growth factors, such as organism or nerve growth factors,
epidermal growth factors, tumor necrosis factors, colony stimulating factors; immunoglobulin-related
molecules, such as synthetic, designed, or engineered antibody molecules; cell receptors, such as cholesterol
receptor; viral molecules, such as viral hemagglutinins, AIDS antigen and immunogen, hepatitis B antigen
and immunogen, foot-and-mouth disease virus antigen and immunogen; bacterial surface effectors, such as
protein A; toxins, such as protein insecticides, algicides, fungicides, and biocides; and systemic proteins of
medical importance, such as myocardial infarct protein (MIP), weight control factor (WCF), caloric rate
protein (CRP) and hirudin (HRD).

One skilled in the art can easily determine whether the use of any known or unknown organism will be
within the scope of this invention in accordance with the above discussion and the following examples.

Microorganisms which may be useful in this respect as potential prokaryotic expression hosts include:
Order Actinomycetales; Family Actinomycetaceae: Genus Matruchonema, Lactophera; Family Actinobacteria:
Genus Actinomyces, Agromyces, Arachnia, Arcanobacterium, Arthrobacter, Brevibacterium, Cellulomonas,
Curtobacterium, Microbacterium, Oerskovia, Promicromonospora, Renibacterium, Rothia; Family
Actinoplanetes: Genus Actinoplanes, Dactylosporangium, Micromonospora; Family Nocardioform
actinomycetes: Genus Caseobacter, Corynebacterium, Mycobacterium, Nocardia, Rhodococcus; Family
Streptomycetes: Genus Streptomyces, Streptoverticillium; Family Maduromycetes: Genus Actinomadura,
Excellospora, Microspore, Planospora, Spirillospora, Streptosporangium; Family Thermospora: Genus
Actinosynnema, Nocardiopsis, Thermophilla; Family Microspora: Genus Actionospora, Saccharospora; Family
Thermoactinomycetes: Genus Thermoactinomyces; and the other prokaryotic genera: Acetivibrio, Acetobacter,
Achromobacter, Acinetobacter, Aeromonas, Bacterionema, Bifidobacterium, Flavobacterium, Kurthia,
Lactobacillus, Leuconostoc, Myxobacteria, Propionibacterium.

The following species from the genus Streptomyces are identified as particularly suitable as hosts:
acidophilus, albus, amylolyticus, argenteolus, aureofaciens, aureus, candidus, cellostaticus, cellulolyticus,
coelicolor, creamorus, diastaticus, farinosus, flaveolus, flavogriseus, fradiae, fulvoviridis, fungicidicus,
gelaticus, glaucescens, globisporus, griseolus, griseus, hygroscopicus, ligninolyticus, lipolyticus, lividans,
moderatus, olivochromogenus, parvus, phaeochromogenes, plicatus, proteolyticus, rectus, roseolus,
roseoviolaceus, scabies, thermolyticus, tumorstaticus, venezuelae, violaceus, violaceus-ruber, violascens,
viridochromogenes, acrimycini, alboniger, ambofaciens, antibioticus, aspergilloides, chartreusis,
clavuligerus, diastatochromogenes, echinatus, erythraeus, fendae, griseofuscus, kanamyceticus,
kasugaensis, koganeiensis, lavendulae, parvulus, peucetius, reticuli, rimosus, and vinaceus.

Also, the following eukaryotic hosts are potentially useful in the practice of this invention:
Absidia, Acremonium, Acrophialopora, Acrospeira, Alternaria, Arthrobotrys, Ascotricha, Aureobasidium,
Beauveria, Bispora, Bjerkandera, Calocera, Candida, Cephaliophora, Cephalosporium, Cerinomyces,
Chaetomium, Chrysosporium, Circinella, Cladosporium, Ciiomastix, Coccospora, Cochliobolus,
Cunninghamella, Curvularia, Custingophora, Dacrymyces, Dacryopinax, Dendryphion, Dictyosporium, Doratomyces,
Drechslera, Eupenicillium, Flammulina, Fusarium, Gliocladium, Gliomastix, Graphium, Hansenula,
Humicola, Hyalodendron, Isaria, Kloeckera, Kluyveromyces, Lipomyces, Mammaria, Merulius, Microascus,
Monodictys, Monosporium, Morchella, Mortierella, Mucor, Myceliophthora, Myrothecium, Neurospora,
Oedocephalum, Oidiodendron, Pachysolen, Papularia, Papulaspora, Penicillium, Peniophora, Periconia,
Phaeocoriolellus, Phanerochaete, Phialophora, Piptocephalis, Pleurotus, Preussia, Pycnoporus,
Rhinocladiella, Rhizomucor, Rhizopus, Rhodotorula, Robillarda, Saccharomyces, Schwanniomyces,
Scolecobasidium, Scopulariopsis, Scytalidium, Stachybotrys, Tetracladium, Thamnidium, Thermoascus,
Thermomyces, Thielavia, Tolypocladium, Torula, Torulopsis, Trametes, Tricellula, Trichocladium,
Trichoderma, Trichurus, Truncatella, Ulocladium, Ustilago, Verticillium, Wardomyces, Xylogone, Yarrowia.

Preferred embodiments of the invention are exemplified in the following procedures. Such procedures
and results are by way of example and are not intended in any way to limit the scope of the claims.

PREPARATIONS

Strains and Plasmids

Streptomyces griseus (ATCC 15395) was obtained from the American Type Culture Collection.
Streptomyces lividans 66 (Bibb, M.J., J.L. Schottel, and S.N. Cohen (1980), A DNA cloning system for
interspecies gene transfer in antibiotic-producing Streptomyces, Nature 284:526-531) and the plasmids pIJ61
and pIJ702 (Thompson, C.J., T. Kieser, J.M. Ward, and D.A. Hopwood (1982),
Physical analysis of antibiotic-resistance genes from Streptomyces and their use in vector construction,
Gene 20:51-62; Katz, E., C.J. Thompson, and D.A. Hopwood (1983), Cloning and expression of the
tyrosinase gene from Streptomyces antibioticus in Streptomyces lividans, J. Gen. Microbiol., 129:2703-2714)
were obtained from the John Innes Institute. E. coli strain HB101 (ATCC 33694) was used for all
transformations. Plasmids pUC8, pUC18 and pUC19 were purchased from Bethesda Research Laboratories.



Media, Growth and Transformation

Growth of Streptomyces mycelium for the isolation of DNA or the preparation of protoplasts was as
described in Hopwood, D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P.
Smith, J.M. Ward, and H. Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual,
The John Innes Foundation, Norwich, UK. Protoplasts of S. lividans were prepared by lysozyme treatment,
transformed with plasmid DNA, and selected for resistance to thiostrepton, as described in Hopwood et al.
(1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation,
Norwich, UK. Transformants were screened for proteolytic or amylolytic activity on LB plates containing 30
µg/ml thiostrepton, and either 1% skim milk or 1% corn starch, respectively. E. coli transformants were
grown on YT medium containing 50 µg/ml ampicillin.



Materials 

Oligonucleotides were synthesized using an Applied Biosystems 380A DNA synthesizer. Columns,
phosphoramidites, and reagents used for oligonucleotide synthesis were obtained from Applied Biosystems,
Inc. through Technical Marketing Associates. Oligonucleotides were purified by polyacrylamide gel
electrophoresis followed by DEAE cellulose chromatography. Enzymes for digesting and modifying DNA were
purchased from New England Biolabs and used according to the supplier's recommendations.
Radioisotopes [α-32P]dATP (~3000 Ci/mmol) and [γ-32P]ATP (~3000 Ci/mmol) were from Amersham.
Thiostrepton was donated by Squibb.
EXAMPLE 1 - Isolation of DNA

Chromosomal DNA was isolated from Streptomyces as described in Chater, K.F., D.A. Hopwood, T.
Kieser, and C.J. Thompson (1982), Gene cloning in Streptomyces, Curr. Topics Microbiol. Immunol., 96:69-95,
except that sodium dodecyl sarcosinate (final conc. 0.5%) was substituted for sodium dodecyl sulfate.
Plasmid DNA of transformed S. lividans was prepared by an alkaline lysis procedure as set out in Hopwood,
D.A., M.J. Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H.
Schrempf (1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation,
Norwich, UK. Plasmid DNA from E. coli was purified by a rapid boiling method (Holmes, D.S., and M.
Quigley (1981), A rapid boiling method for the preparation of bacterial plasmids, Anal. Biochem., 114:193-197).
DNA fragments and vectors used for all constructions were separated by electrophoresis on low
melting point agarose, and purified from the molten agarose by phenol extraction.



EXAMPLE 2 - Construction of Genomic Library

Chromosomal DNA of S. griseus ATCC 15395 was digested to completion with BamHI and fractionated
by electrophoresis on a 0.8% low melting point agarose gel. DNA fragments ranging in size from 4 to 12
kilobase pairs (kb) were isolated from the agarose gel. The plasmid vectors pUC18 and pUC19 were
digested with BamHI and treated with calf intestinal alkaline phosphatase (Boehringer Mannheim). The S.
griseus BamHI fragments (0.3 µg) and vectors (0.8 µg) were ligated in a final volume of 20 µl as described
in Maniatis, T., E.F. Fritsch, and J. Sambrook (1982), Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY. Approximately 8000 transformants of HB101 were obtained
from each ligation reaction.
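
The mass amounts used in the ligation can be put on a molar basis with a short calculation. The following Python sketch is illustrative only and is not part of the described procedure. It assumes an average mass of 650 g/mol per base pair and a vector size of 2686 bp for pUC18/pUC19, neither of which is stated in the text; the function names are hypothetical.

```python
# Rough molar amounts for the ligation (illustrative assumptions only).
AVG_BP_MASS = 650.0   # g/mol per base pair; a common approximation, not from the text
VECTOR_BP = 2686      # assumed size of pUC18/pUC19; not stated in the text


def pmol_of_dna(mass_ug, length_bp):
    """Convert micrograms of double-stranded DNA to picomoles of molecules."""
    moles = (mass_ug * 1e-6) / (length_bp * AVG_BP_MASS)
    return moles * 1e12


def insert_to_vector_ratio(insert_ug, insert_bp, vector_ug, vector_bp=VECTOR_BP):
    """Molar ratio of insert molecules to vector molecules."""
    return pmol_of_dna(insert_ug, insert_bp) / pmol_of_dna(vector_ug, vector_bp)


if __name__ == "__main__":
    # 0.3 ug of fragments and 0.8 ug of vector, at both ends of the 4-12 kb range.
    for insert_bp in (4000, 12000):
        ratio = insert_to_vector_ratio(0.3, insert_bp, 0.8)
        print(f"{insert_bp} bp insert: insert:vector molar ratio = {ratio:.2f}")
```

Under these assumptions the vector is in molar excess over the genomic fragments across the whole 4 to 12 kb size range.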



EXAMPLE 3 - Subcloning of Protease Gene Fragments

A hybrid Streptomyces-E. coli vector was constructed by ligating pIJ702, which had been linearized by
BamHI, into the BamHI site of pUC8. The unique BglII site of this vector was used for subcloning BamHI
and BglII fragments of the protease genes. Other fragments were adapted with BamHI linkers to facilitate
ligation into the BglII site. The hybrid vector, with pUC8 inserted at the BamHI site of pIJ702, was incapable
of replicating in Streptomyces. However, the E. coli plasmid could be readily removed prior to transforming
S. lividans by digestion with BamHI followed by recircularization with T4 ligase.
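
Subcloning BamHI fragments into a BglII site relies on the two enzymes leaving the same cohesive end. The Python sketch below illustrates this under stated assumptions: the recognition sequences and cut positions are the commonly published ones for BamHI, BglII and Sau3AI and are not given in the text, and the helper names are hypothetical.

```python
# Recognition sites written 5'->3'; '^' marks the cut on the top strand.
# These are the commonly published sites, assumed here (not stated in the text).
ENZYMES = {
    "BamHI": "G^GATCC",
    "BglII": "A^GATCT",
    "Sau3AI": "^GATC",
}


def overhang(site_with_cut):
    """Return the 5' overhang left by a palindromic cutter."""
    site = site_with_cut.replace("^", "")
    cut = site_with_cut.index("^")
    # In a palindromic site the bottom strand is cut at the mirror position.
    return site[cut:len(site) - cut]


def ligated_junction(left_enzyme, right_enzyme):
    """Top-strand sequence formed when an end cut by left_enzyme is joined
    to an end cut by right_enzyme."""
    left = ENZYMES[left_enzyme]
    right = ENZYMES[right_enzyme]
    left_site = left.replace("^", "")
    right_site = right.replace("^", "")
    return left_site[:left.index("^")] + right_site[right.index("^"):]


def is_recut(junction, enzyme):
    """True if the junction still contains the enzyme's recognition site."""
    return ENZYMES[enzyme].replace("^", "") in junction


if __name__ == "__main__":
    for name, site in ENZYMES.items():
        print(name, "overhang:", overhang(site))
    hybrid = ligated_junction("BamHI", "BglII")
    print("BamHI/BglII junction:", hybrid)
    for name in ENZYMES:
        print("  recut by", name, ":", is_recut(hybrid, name))
```

In this model all three enzymes leave a GATC overhang, and the hybrid junction formed between a BamHI end and a BglII end contains neither of the two six-base sites, which is why such a junction cannot be reopened with either enzyme.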



EXAMPLE 4 - Construction for Testing the sprB Signal Peptide 

The 0.4 kb Sau3AI-NcoI fragment containing the aminoglycoside phosphotransferase gene promoter
was isolated from pIJ61 and subcloned into the BamHI and NcoI sites of a suitable vector. The NcoI site
containing the initiator ATG was joined to the MluI site of the sprB signal using two 43-mer oligonucleotides,
which reconstructed the amino-terminus of the signal peptide. An amylase gene of S. griseus was adapted
by ligating a 14-mer PstI linker to a SmaI site in the third codon. This removed the signal peptide and
restored the amino-terminus of the mature amylase. The HaeII site of the sprB signal was joined to the PstI
site of the amylase subclone using two 26-mer oligonucleotides, which reconstructed the carboxy-terminus
of the signal peptide.

EXAMPLE 5 - Hybridization 


A 20-mer oligonucleotide (5'-TTCCC(C/G)AACAACGACTACGG-3') was designed from an amino acid
sequence (FPNNDYG) which was common to both proteases. For use as a hybridization probe, the
oligonucleotide was end-labelled using T4 polynucleotide kinase (New England Biolabs) and [γ-32P]ATP.
Digested genomic or plasmid DNA was transferred to a Hybond-N nylon membrane (Amersham) by
electroblotting and hybridized in the presence of formamide (50%) as described in Hopwood, D.A., M.J.
Bibb, K.F. Chater, T. Kieser, C.J. Bruton, H.M. Kieser, D.J. Lydiate, C.P. Smith, J.M. Ward, and H. Schrempf
(1985), Genetic Manipulation of Streptomyces, A Laboratory Manual, The John Innes Foundation, Norwich,
UK. The filters were hybridized with the labelled oligonucleotide probe at 30°C for 18 h, and washed at
47°C. The S. griseus genomic library was screened by colony hybridization as described in Wallace, R.B.,
M.J. Johnson, T. Hirose, T. Miyake, E.H. Kawashima, and K. Itakura (1981), The use of synthetic
oligonucleotides as hybridization probes. II. Hybridization of oligonucleotides of mixed sequence to rabbit
β-globin DNA, Nucl. Acids Res., 9:879-894.
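
The relationship between the mixed-sequence 20-mer and the FPNNDYG peptide can be checked with a few lines of Python. This is a minimal sketch for illustration, not a description of how the probe was designed: it expands the (C/G) position into the two individual oligonucleotides and translates each with the standard genetic code. The function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

PROBE = "TTCCC(C/G)AACAACGACTACGG"  # probe sequence as given in the text


def expand_degenerate(probe):
    """Expand positions written as (X/Y) into every individual sequence."""
    parts = re.split(r"\(([ACGT/]+)\)", probe)
    # Even indices hold fixed text, odd indices hold the alternatives.
    choices = [p.split("/") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]


def translate(seq):
    """Translate complete codons in frame 1; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    for variant in expand_degenerate(PROBE):
        print(variant, len(variant), translate(variant), "+", variant[18:])
```

Both variants translate to FPNNDY over their first 18 bases, and the remaining two bases (GG) match the first two positions of any glycine codon, so the 20-mer spans the whole FPNNDYG sequence without needing degeneracy at the final codon.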


EXAMPLE 6 - DNA Sequencing 

The sequences of sprA and sprB were determined using a combination of the chemical cleavage
sequencing method (Maxam, A., and W. Gilbert (1977), A new method for sequencing DNA, Proc. Natl.
Acad. Sci. U.S.A., 74:560-564) and the dideoxy sequencing method (Sanger, F., S. Nicklen, and A.R.
Coulson (1977), DNA sequencing with chain terminating inhibitors, Proc. Natl. Acad. Sci. U.S.A.,
74:5463-5467). Restriction fragments were end-labeled using either polynucleotide kinase or the large
fragment of DNA Polymerase I (Amersham), with the appropriate radiolabeled nucleoside triphosphate.
Labeled fragments were either digested with a second restriction endonuclease or strand-separated,
followed by electroelution from a polyacrylamide gel. Subclones were prepared in the M13 bacteriophage
and the dideoxy sequencing reactions were run using the -20 universal primer (New England Biolabs). In
some areas of strong secondary structure, compressions and polymerase failure necessitated the use of
either inosine (Mills, D.R., and F.R. Kramer (1979), Structure independent nucleotide sequence analysis,
Proc. Natl. Acad. Sci. U.S.A., 76:2232-2235) or 7-deazaguanosine (Mizusawa, S., S. Nishimura, and F. Seela
(1986), Improvement of the dideoxy chain termination method of DNA sequencing by use of deoxy-7-
deazaguanosine triphosphate in place of dGTP, Nucleic Acids Res., 14:1319-1324) analogs in the dideoxy
reactions to clarify the sequence. The sequences were compiled using the software of DNASTAR™
(Doggette, P.E., and F.R. Blattner (1986), Personal access of sequence databases on personal computers,
Nucleic Acids Res., 14:611-619).
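
Regions prone to the secondary structure mentioned above are often short inverted repeats that can fold back into a hairpin. The following Python sketch is a simplified illustration under stated assumptions, not the method used in this work: it searches for exact inverted repeats of a fixed stem length separated by a short loop. The test sequence is hypothetical and does not come from sprA or sprB, and the function names and default parameters are arbitrary choices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpin_stems(seq, stem=6, min_loop=3, max_loop=12):
    """Find exact inverted repeats that could pair as a hairpin stem.

    Returns (left_start, right_start, left_arm) tuples with 0-based positions.
    """
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            # A slice that runs past the end is shorter and cannot match.
            if seq[j:j + stem] == target:
                hits.append((i, j, left))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: a GC-rich 6-base stem around a TTTT loop.
    example = "AAGCCGGATTTTTCCGGCAA"
    for left_start, right_start, arm in find_hairpin_stems(example):
        print(f"stem {arm} at {left_start}, pairs with position {right_start}")
```

A real analysis would also allow mismatches and weigh GC content, so this should be read only as a way of picturing where compressions might be expected.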

Although preferred embodiments of the invention have been described in detail, it will be understood by 
those skilled in the art that variations may be made thereto without departing from either the spirit of the 
invention or the scope of the appended claims. 


Claims 

1. A recombinant DNA sequence comprising a DNA signal sequence encoding a signal peptide and a
DNA gene sequence encoding a protein, said recombinant DNA sequence when expressed in a living cell
encoding said signal peptide with said protein, said signal peptide directing the secretion of said protein
from a cell within which said DNA signal sequence is expressed.

2. A recombinant DNA sequence of claim 1 wherein said DNA signal sequence is adapted for
expression in a living cell of the genera Streptomyces.

3. A recombinant DNA sequence of claim 2 wherein said DNA signal sequence encodes a peptide
having a 38-amino acid sequence.

4. A recombinant DNA sequence of claim 1, wherein said sequence is inserted in a suitable vector.

5. A recombinant DNA sequence of claim 4 wherein said vector is a plasmid or phage.

6. A recombinant DNA sequence of claim 1 wherein said DNA gene sequence encodes an enzyme.

7. A recombinant DNA sequence of claim 1 wherein said DNA signal sequence is adapted for
expression in a living cell having enzymes catalyzing the formation of disulphide bonds.

8. A recombinant DNA sequence of claim 7 wherein said enzymes include protein disulphide
oxidoreductase.

9. A biologically pure isolated DNA signal sequence encoding a 38-amino acid signal peptide which
directs secretion of a recombinant gene protein linked to such 38-amino acid signal peptide from a cell in
which said DNA signal sequence is expressed, said DNA signal sequence being isolated from
Streptomyces.

10. A biologically pure isolated DNA signal sequence of claim 9 wherein said Streptomyces is
Streptomyces griseus.

11. A biologically pure isolated DNA signal sequence of claim 10 wherein said Streptomyces is
Streptomyces griseus strain ATCC 15395.

12. A biologically pure isolated DNA signal sequence of claim 11 wherein said DNA sequence is
isolated from either of two genes of said strain encoding, respectively, protease A and protease B.

13. Biologically pure isolated DNA signal sequences of claim 12 wherein each of said DNA sequences
as isolated from each of said two genes has a DNA homology with the other DNA sequence of
approximately 58%.

14. A DNA signal sequence of claim 9 in conjunction with a gene sequence encoding a protein wherein
said signal sequence and said gene sequence are inserted in a suitable vector.

15. A DNA signal sequence of claim 10 in conjunction with a gene sequence encoding a protein
wherein said signal sequence and said gene sequence are inserted in a suitable vector.

16. A DNA signal sequence of claim 11 in conjunction with a gene sequence encoding a protein
wherein said signal sequence and said gene sequence are inserted in a suitable vector.

17. A DNA signal sequence of claim 12 in conjunction with a gene sequence encoding a protein
wherein said signal sequence and said gene sequence are inserted in a suitable vector.

18. A DNA signal sequence of claim 13 in conjunction with a gene sequence encoding a protein
wherein said signal sequence and said gene sequence are inserted in a suitable vector.

19. A vector of claim 14 or 15 wherein said vector is a plasmid or phage.

20. A vector of claim 16, 17 or 18 wherein said vector is a plasmid or phage.

21. A biologically pure isolated DNA signal sequence of claim 9 having the sequence of Figure 4a.

22. A biologically pure isolated DNA signal sequence of claim 9 having the sequence of Figure 5a.

23. A fused protein encoded by said recombinant DNA sequence of claim 1.

24. A fused protein encoded by said recombinant DNA sequence of claim 2.

25. A fused protein encoded by said recombinant DNA sequence of claim 3.

26. A fused protein encoded by said recombinant DNA sequence of claim 4.

27. A fused protein encoded by said recombinant DNA sequence of claim 5.

28. A fused protein encoded by said recombinant DNA sequence of claim 6.

29. A fused protein encoded by said recombinant DNA sequence of claim 7.

30. A fused protein encoded by said recombinant DNA sequence of claim 8.

31. A fused protein of claim 23 comprising the amino acid sequence of Figure 4a in conjunction with an
amino acid sequence of a protein which is normally foreign to the living cell in which said fused protein is
formed.

32. A fused protein of claim 23 comprising the amino acid sequence of Figure 5a in conjunction with an
amino acid sequence of a protein which is normally foreign to the living cell in which said fused protein is
formed.

33. A transformed prokaryotic cell comprising a vector of claim 14 inserted into said cell.

34. A transformed prokaryotic cell comprising a vector of claim 15 inserted into said cell.

35. A transformed prokaryotic cell comprising a vector of claim 16 inserted into said cell.

36. A transformed prokaryotic cell comprising a vector of claim 17 inserted into said cell.

37. A transformed prokaryotic cell comprising a vector of claim 18 inserted into said cell.

38. A transformed prokaryotic cell of claim 33 or 34 wherein said vector is a plasmid or phage.

39. A transformed prokaryotic cell of claim 35, 36 or 37 wherein said vector is a plasmid or phage.

40. A transformed prokaryotic cell of claim 33 or 34 wherein said cell is selected from the Streptomyces
genera.

41. A transformed prokaryotic cell of claim 35, 36 or 37 wherein said cell is selected from the
Streptomyces genera.

42. A biologically pure culture of a transformed prokaryotic cell, said cell being transformed with a
recombinant DNA sequence of claim 1 being inserted into said cell in a suitable vector, said culture producing
the fused protein of said amino acid signal peptide and said protein as a short-lived intermediate, and said
culture producing the said protein in a recoverable quantity upon fermentation in an aqueous nutrient
medium, said signal peptide directing the secretion of said protein from said cell.

43. A biologically pure culture of claim 42 wherein said prokaryotic cell contains enzymes catalyzing the
formation of disulphide bonds.

44. A biologically pure culture of claim 43 wherein said enzymes include protein disulphide
oxidoreductase.

45. A biologically pure culture of claim 42 wherein said prokaryotic cell is selected from the
Streptomyces genera.

46. A biologically
Streptomyces genera.

47. A biologically
Streptomyces genera.

48. A biologically pure culture of claim 45 wherein said prokaryotic cell is selected from Streptomyces
lividans.

49. A biologically pure culture of claim 46 wherein said prokaryotic cell is selected from Streptomyces
lividans.

50. A biologically pure culture of claim 47 wherein said prokaryotic cell is selected from Streptomyces
lividans.

51. A biologically pure DNA sequence isolated from Streptomyces griseus encoding for a fused protein
of signal peptide-propeptide-protease A structure, said DNA sequence having the combined DNA sequence
of Figures 4a, 4b and 4c.

52. A biologically pure DNA sequence isolated from Streptomyces griseus encoding for a fused protein
of signal peptide-propeptide-protease B structure, said DNA sequence having the combined DNA sequence
of Figures 5a, 5b and 5c.
A

GTCGACCCCCATCTCATTCCGGGCTCGCGGGCGCGAATCCGGCCTTGCGTCAGGGACGGTCCCCGTCAACGATTC

CAGCGTGCAACTTGGCAGGTTCACGCCCACTCCCACTGGGTGAGAACCTCGCGCACCAACGGCCCCACCTCACCC

-116

MTFKRFSPLSSTSR

GACCGGGCCGTCCCCCCATACCTCGGAGGATCTCGTGACCTTCAAGCGCTTCTCGCCGCTCAGCAGCACGTCAAG
-100 -80

YARLLAVASGLVAAAALATPSAVAA
ATATGCACGGCTCCTCGCCGTGGCCTCCGGCCTGGTGGCCGCCGCGGCCCTGGCCACCCCCTCGGCCGTCGCCGC

-60

PEAESKATVSQLADASSAILAADVA
TCCCGAGGCGGAGTCCAAGGCCACCGTTTCGCAGCTCGCCGACGCCAGCTCCGCCATCCTCGCCGCTGATGTGGC

-40

GTAWYTEASTGKIVLTADSTVSKAE
GGGCACCGCCTGGTACACGGAGGCGAGCACGGGCAAGATCGTCCTCACCGCCGACAGCACCGTGTCGAAGGCCGA
-20

LAKVSNALAGSKAKLTVKRAEGKFT
ACTGGCCAAGGTCAGCAACGCGCTGGCGGGCTCCAAGGCGAAACTGACGGTCAAGCGCGCCGAGGGCAAGTTCAC

20

PLIAGGEAITTGGSRCSLGFNVSVN
CCCGCTGATCGCGGGCGGCGAGGCCATCACCACCGGTGGCAGCCGCTGTTCGCTCGGCTTCAACGTGTCGGTCAA

40

GVAHALTAGHCTNISASWSIGTRTG
CGGCGTCGCCCACGCGCTCACCGCCGGCCACTGCACCAACATCAGCGCCAGCTGGTCCATCGGCACGCGCACCGG

60

TSFPNNDYGIIRHSNPAAADGRVYL
AACCAGCTTCCCGAACAACGACTACGGCATCATCCGCCACTCGAACCCGGCGGCGGCCGACGGCCGGGTCTACCT
80

YNGSYQDITTAGNAFVGQAVQRSGS
GTACAACGGCTCCTACCAGGACATCACGACGGCGGGCAACGCCTTTGTGGGGCAGGCCGTCCAGCGCAGCGGCAG
100 120
TTGLRSGSVTGLNATVNYGSSGIVY
CACCACCGGGCTGCGCAGCGGCTCGGTCACCGGCCTCAACGCCACGGTCAACTACGGTTCCAGCGGGATCGTGTA

140

GMIQTNVCAEPGDSGGSLFAGSTAL
CGGCATGATCCAGACCAACGTCTGTGCCGAGCCCGGTGACAGTGGAGGCTCGCTCTTCGCGGGCAGCACCGCTCT

160

GLTSGGSGNCRTGGTTFYQPVTEAL
GGGTCTCACCTCCGGCGGCAGTGGCAACTGCCGGACCGGCGGCACCACGTTCTACCAGCCCGTCACCGAGGCGCT

SAYGATVL*

GAGCGCCTACGGGGCAACGGTCCTGTAGCCGGTGCCACCGGGGCTTCGGGCTGACCGCCGACCGGCCGCCCGAAG

CCCCGCGCGACGCCCCACCCCGGCGGACCGTGCTCGCGCGCGGTCCGCCCTCGCCGTGCCACGAACCCCACCGTC
CTTTCCCCGTCAGGCGCCTGCCGCTCGACCCGCATCGCGAAGTTGCCGAGAGTGGCCGGCTCGCACCGGCACTGC
TGAAGTCCTGCCCTCGCCCCACGGTCCGGTTCGCGCCCGCCCGGACGCGGACCCGCGCCTGGGGAAGCCCTCACT
CAACCCCGTTGCGCGCGGATGAGGTCGCGATACCAGGCGAAGGAGGCCTTCGGGGTGCGGACCTGTGTCTCGTGG

TCGAC
-60

PEAESKATVSQLADASSAILAADVA
TCCCGAGGCGGAGTCCAAGGCCACCGTTTCGCAGCTCGCCGACGCCAGCTCCGCCATCCTCGCCGCTGATGTGGC

-40

GTAWYTEASTGKIVLTADSTVSKAE
GGGCACCGCCTGGTACACGGAGGCGAGCACGGGCAAGATCGTCCTCACCGCCGACAGCACCGTGTCGAAGGCCGA
-20

LAKVSNALAGSKAKLTVKRAEGKFT
ACTGGCCAAGGTCAGCAACGCGCTGGCGGGCTCCAAGGCGAAACTGACGGTCAAGCGCGCCGAGGGCAAGTTCAC

P L*

CCCGCTG

IAGGEAITTGGSRCSLGFNVSVN
ATCGCGGGCGGCGAGGCCATCACCACCGGTGGCAGCCGCTGTTCGCTCGGCTTCAACGTGTCGGTCAA

40

GVAHALTAGHCTNISASWSIGTRTG
CGGCGTCGCCCACGCGCTCACCGCCGGCCACTGCACCAACATCAGCGCCAGCTGGTCCATCGGCACGCGCACCGG

60

TSFPNNDYGIIRHSNPAAADGRVYL
AACCAGCTTCCCGAACAACGACTACGGCATCATCCGCCACTCGAACCCGGCGGCGGCCGACGGCCGGGTCTACCT
80

YNGSYQDITTAGNAFVGQAVQRSGS
GTACAACGGCTCCTACCAGGACATCACGACGGCGGGCAACGCCTTTGTGGGGCAGGCCGTCCAGCGCAGCGGCAG

100 120
TTGLRSGSVTGLNATVNYGSSGIVY
CACCACCGGGCTGCGCAGCGGCTCGGTCACCGGCCTCAACGCCACGGTCAACTACGGTTCCAGCGGGATCGTGTA

140

GMIQTNVCAEPGDSGGSLFAGSTAL
CGGCATGATCCAGACCAACGTCTGTGCCGAGCCCGGTGACAGTGGAGGCTCGCTCTTCGCGGGCAGCACCGCTCT

160

GLTSGGSGNCRTGGTTFYQPVTEAL
GGGTCTCACCTCCGGCGGCAGTGGCAACTGCCGGACCGGCGGCACCACGTTCTACCAGCCCGTCACCGAGGCGCT

181

SAYGATVL*
GAGCGCCTACGGGGCAACGGTCCTGTAG
B

CGCTGTGCCCGCCGTGCGCCTTCGCCGATCACTTCATCTGCCCSTTCCCGCCCCCGGGCAACACGCTCGCCGCGG

CGGTTTTGGCGGGGGAGCGGAACCGGATCGACGCCTGACCCGCGCGAGGCCCCACCGGCCCCGGCAGCCGCACGG
CTCCCGGGGCCGGTGACGGATGTGACCCGCGTGGCCGAAAGGCATTCTTGCGTCCCCCGTCCGGCCCCCTCGATA

CTCCGGTCAGCGATTGTCAGGGGCACGGCGAATTCGAAATCCGGACAGGCCCCCGACTGCGCCTCACGGGCCCGC

MRIKRTSNRSN
CACCCCACAGGAGGGCCCCCGATTCCCCTCGGAGGAACCCGAAfiTGAGGATCAAGCGCACCAGCAACCGCTCGAA
-100

AARRVRTTAVLAGLAAVAALAVPTA
CGCGGCGAGACGCGTCCGCACCACCGCCGTACTCGCGGGGCTCGCCGCCGTCGCGGCGCTGGCCGTTCCCACCGC

NAETPRTFSANQLTAASDAVLGADI
GAACGCCGAAACCCCCCGGACGTTCAGTGCCAACCAGCTGACCGCGGCGAGCGACGCCGTGCTCGGCGCCGACAT

AGTAWNIDPQSKRLVVTVDSTVSKA
CGCGGGCACCGCCTGGAACATCGACCCGCAGTCCAAGCGCCTCGTCGTCACCGTCGACAGCACGGTCTCGAAGGC

EINQIKKSAGANADALRIERTPGKF
MAGATCAACCAGATCAAGAAGTCGGCGGGCGCCAACGCCGACGCSCTGCGGATCGAGCGCACCCCCG^GTT

TKL*ISGGDAIYSSTGRCSLGFNVRS
CACCAAGCTGATCTCCGGCGGCGACGCGATCTACTCCAGCACCGGACGCTGCTCGCTCGGCTTCAACGTCCGCAG

40

GSTYYFLTAGHCTDGATTWWANSAR
CGGCAGCACCTACTACTTCCTGACCGCCGGCCACTGCACGGACGGCGCGACCACCTGGTGGGCGAACTCGGCCCG

TTVLGTTSGSSFPNNDYGIVRYTNT
CACCACGGTGCTCGGCACGACCTCCGGGTCGAGCTTCCCGAACAACGACTACGGCATCGTGCGCTACACCAACAC

TIPKDGTVGGQDITSAANATVGMAV
CACCATTCCCAAGGACGGCACGGTCGGCGGCCAGGACATCACCAGCGCCGCCAACGCCACCGTCGGCATGGCGGT

TRRGSTTGTHSGSVTALNATVNYGG
CACCCGCCGCGGCTCCACCACCGGCACCCACAGCGGTTCGGTCACCGCACTCAACGCCACCGTCAACTACGGGGG

GDVVYGMIRTNVCAEPGDSGGPLYS
CGGCGACGTCGTCTACGGCATGATCCGCACCAACGTGTGCGCGGAGCCCGGCGACTCCGGCGGCCCGCTCTACTC

GTRAIGLTSGGSGNCSSGGTTFFQP
CGGCACCCGGGCGATCGGTCTGACCTCCGGCGGCAGCGGCAACTGCTCCTCCGGCGGCACGACCTTCTTCCAGCC

GST(yVCC6A6BCGCT64c6CGTACGl:GTCAlcST6TACTGACC^ 


CGTACAAACGTGCCCCCGTCCGGAATTCCGGACGGGGGCTCCCGCTCGCCGGGGAGCTCTTGAGAGGATGTCGCC 
ACGACGGGTCGCCGCTGCGCGTC
FIG. 5B.
FIG. 5C.